A Domain Adaptive Density Clustering Algorithm for Data With Varying Density Distribution

Abstract

As one type of efficient unsupervised learning method, clustering algorithms have been widely used in data mining and knowledge discovery with noticeable advantages. However, clustering algorithms based on density peaks have a limited effect on data with varying density distribution (VDD), equilibrium distribution (ED), and multiple domain-density maximums (MDDM), leading to the problems of sparse cluster loss and cluster fragmentation. To address these problems, we propose a Domain-Adaptive Density Clustering (DADC) algorithm, which consists of three steps: domain-adaptive density measurement, cluster center self-identification, and cluster self-ensemble. For data with VDD features, clusters in sparse regions are often neglected when uniform density peak thresholds are used, which results in the loss of sparse clusters. We define a domain-adaptive density measurement method based on K-Nearest Neighbors (KNN) to adaptively detect the density peaks of different density regions. We treat each data point and its KNN neighborhood as a subgroup to better reflect its density distribution in a domain view. In addition, for data with ED or MDDM features, a large number of density peaks with similar values can be identified, which results in cluster fragmentation. We propose cluster center self-identification and cluster self-ensemble methods to automatically extract the initial cluster centers and merge the fragmented clusters. Experimental results demonstrate that, compared with other algorithms, the proposed DADC algorithm can obtain more reasonable clustering results on data with VDD, ED, and MDDM features. Benefitting from its few parameter requirements and non-iterative nature, DADC achieves low computational complexity and is suitable for large-scale data clustering.
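The idea of measuring density relative to a point's KNN neighborhood, rather than against one global threshold, can be illustrated with a short sketch. This is not the paper's formula: the inverse-mean-distance density, the normalization by the neighbors' average density, and the function names `knn_density` and `domain_adaptive_density` are all assumptions chosen for illustration only.

```python
import numpy as np


def knn_density(points, k):
    """Illustrative KNN density: inverse of the mean distance to the k nearest neighbors.

    Returns the density of every point and the indices of its k nearest neighbors.
    (Assumed definition for this sketch, not taken from the DADC paper.)
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Exclude each point from its own neighborhood.
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    knn_dists = np.take_along_axis(dists, neighbors, axis=1)
    density = 1.0 / (knn_dists.mean(axis=1) + 1e-12)
    return density, neighbors


def domain_adaptive_density(points, k):
    """Density of each point relative to the mean density of its KNN neighborhood.

    Values above 1 mark points that are denser than their surroundings,
    regardless of how dense or sparse the region is in absolute terms.
    """
    density, neighbors = knn_density(points, k)
    local_mean = density[neighbors].mean(axis=1)
    return density / local_mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dense = rng.normal(0.0, 0.1, size=(100, 2))    # tight cluster
    sparse = rng.normal(10.0, 2.0, size=(100, 2))  # loose cluster
    data = np.vstack([dense, sparse])

    raw, _ = knn_density(data, k=5)
    adaptive = domain_adaptive_density(data, k=5)
    print("max raw density (dense, sparse):", raw[:100].max(), raw[100:].max())
    print("max relative density (dense, sparse):", adaptive[:100].max(), adaptive[100:].max())
```

Under these assumptions, a uniform threshold on the raw density would tend to miss the loose cluster entirely, while the relative density still exposes a local peak inside it, which is the behavior the abstract describes for data with VDD features.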

Related resources

Density Based Distribute Data Stream Clustering Algorithm

To solve the problem of clustering distributed data streams, the algorithm DB-DDSC (Density-Based Distribute Data Stream Clustering) was proposed. The algorithm consists of two stages. First, it presents the concept of the circular-point, based on representative points, and designs an iterative algorithm to find the density-connected circular-points, then generates the local model at the remote si...

Improvement of density-based clustering algorithm using modifying the density definitions and input parameter

Clustering is one of the main tasks in data mining, which means grouping similar samples. In general, there is a wide variety of clustering algorithms. One of these categories is density-based clustering. Various algorithms have been proposed for this method; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...

Density Adaptive Parallel Clustering

In this paper we introduce a new nearest-neighbours-based approach to clustering and compare it with previous solutions. The resulting algorithm, which takes inspiration from both DBSCAN and minimum spanning tree approaches, is deterministic but proves simpler and faster, and does not require setting in advance a value for k, the number of clusters.

Grid Density Clustering Algorithm

Data mining is the method of finding useful information in huge data repositories. Clustering is a significant task in data mining. It is an unsupervised learning task in which similar data items are grouped together to form clusters. These days clustering plays a major role in everyday applications. In this paper, the field of KDD i.e. Knowledge Discovery in Databases, Data mining...

Density-Based Clustering Based on Probability Distribution for Uncertain Data

Today a great deal of uncertain digital data is being produced, and handling this uncertain data is very difficult. Commonly, the distance between uncertain object descriptions is expressed by one numerical distance value. Clustering uncertain data is one of the essential and challenging tasks in mining uncertain data. The previous methods extend partitioning clustering methods like k-mean...

Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2021

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2019.2954133